PERT – Perfect Random Tree Ensembles
Authors
Abstract
Ensemble classifiers originated in the machine learning community. They work by fitting many individual classifiers and combining them by weighted or unweighted voting. The ensemble classifier is often much more accurate than the individual classifiers from which it is built. In fact, ensemble classifiers are among the most accurate general-purpose classifiers available. We introduce a new ensemble method, PERT, in which each individual classifier is a perfectly-fit classification tree with random selection of splits. Compared to other ensemble methods, PERT is very fast to fit. Given the randomness of the split selection, PERT is surprisingly accurate. Calculations suggest that one reason why PERT works so well is that although the individual tree classifiers are extremely weak, they are almost uncorrelated. The simple probabilistic nature of the classifier lends itself to theoretical analysis. We show that PERT is fitting a continuous posterior probability surface for each class. As such, it can be viewed as a classification-via-regression procedure that fits a continuous interpolating surface. In theory, this surface could be found using a one-shot procedure.
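The abstract's core idea — grow each tree until it perfectly fits the training data, choosing every split at random, then combine trees by unweighted voting — can be sketched in a few dozen lines. The split rule below (pick two training points with different labels, pick a feature on which they differ, threshold uniformly between their values) is a simplified rendering of PERT's randomized split, not the authors' exact procedure:

```python
import random

def grow_pert_tree(X, y, rnd):
    """Grow one perfectly-fit tree with purely random splits.
    X is a list of feature lists, y a list of class labels."""
    if len(set(y)) == 1:
        return ('leaf', y[0])                     # pure node: stop splitting
    classes = sorted(set(y))
    # pick one point from each of two different classes
    i = rnd.choice([k for k in range(len(y)) if y[k] == classes[0]])
    j = rnd.choice([k for k in range(len(y)) if y[k] == classes[1]])
    feats = list(range(len(X[0])))
    rnd.shuffle(feats)                            # try features in random order
    for f in feats:
        if X[i][f] != X[j][f]:
            break
    else:                                         # duplicate points, conflicting labels:
        return ('leaf', max(set(y), key=y.count)) # perfect fit impossible, take majority
    lo, hi = sorted((X[i][f], X[j][f]))
    t = lo + rnd.random() * (hi - lo)             # threshold in [lo, hi): both sides nonempty
    left = [k for k in range(len(y)) if X[k][f] <= t]
    right = [k for k in range(len(y)) if X[k][f] > t]
    return ('split', f, t,
            grow_pert_tree([X[k] for k in left], [y[k] for k in left], rnd),
            grow_pert_tree([X[k] for k in right], [y[k] for k in right], rnd))

def predict_one(node, x):
    while node[0] == 'split':
        _, f, t, left, right = node
        node = left if x[f] <= t else right
    return node[1]

def pert_predict(trees, x):
    """Unweighted majority vote over the ensemble."""
    votes = [predict_one(t, x) for t in trees]
    return max(set(votes), key=votes.count)

# usage: two separable clusters, 25 random perfectly-fit trees
rnd = random.Random(0)
X = [[0, 0], [0, 1], [1, 0], [1, 1], [4, 4], [4, 5], [5, 4], [5, 5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
trees = [grow_pert_tree(X, y, rnd) for _ in range(25)]
preds = [pert_predict(trees, x) for x in X]
```

Because every tree fits the training set perfectly (when the points are distinct), each tree — and hence the vote — classifies every training point correctly; no search over candidate splits is needed, which is why fitting is so fast.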
Similar references
The Utility of Randomness in Decision Tree Ensembles
The use of randomness in constructing decision tree ensembles has drawn much attention in the machine learning community. In general, ensembles introduce randomness to generate diverse trees and in turn they enhance ensembles’ predictive accuracy. Examples of such ensembles are Bagging, Random Forests and Random Decision Tree. In the past, most of the random tree ensembles inject various kinds ...
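Why diverse (i.e. nearly uncorrelated) trees enhance accuracy can be made concrete with a back-of-the-envelope calculation, not taken from any of the papers above: if m voters are each correct independently with probability p > 1/2, the majority vote's accuracy is a binomial tail probability that grows with m.

```python
from math import comb

def majority_vote_accuracy(p, m):
    """Probability that a majority of m independent voters,
    each correct with probability p, is correct (m odd)."""
    return sum(comb(m, k) * p**k * (1 - p)**(m - k)
               for k in range(m // 2 + 1, m + 1))

# weak individual voters (55% accurate) become a strong ensemble:
acc_11 = majority_vote_accuracy(0.55, 11)
acc_101 = majority_vote_accuracy(0.55, 101)
```

Correlation between voters erodes this gain, which is why the near-zero correlation of PERT's random trees matters more than their individual (weak) accuracy.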
Tree Space Prototypes: Another Look at Making Tree Ensembles Interpretable
Ensembles of decision trees have good prediction accuracy but suffer from a lack of interpretability. We propose a new approach for interpreting tree ensembles by finding prototypes in tree space, utilizing the naturally-learned similarity measure from the tree ensemble. Demonstrating the method on random forests, we show that the method benefits from two unique aspects of tree ensembles by lev...
Interpreting Tree Ensembles with inTrees
Tree ensembles such as random forests and boosted trees are accurate but difficult to understand, debug and deploy. In this work, we provide the inTrees (interpretable trees) framework that extracts, measures, prunes and selects rules from a tree ensemble, and calculates frequent variable interactions. A rule-based learner, referred to as the simplified tree ensemble learner (STEL), can also b...
Naïve Bayes Ensembles with a Random Oracle
Ensemble methods with Random Oracles have been proposed recently (Kuncheva and Rodríguez, 2007). A random-oracle classifier consists of a pair of classifiers and a fixed, randomly created oracle that selects between them. Ensembles of random-oracle decision trees were shown to fare better than standard ensembles. In that study, the oracle for a given tree was a random hyperplane at the root of ...
Investigation of Property Valuation Models Based on Decision Tree Ensembles Built over Noised Data
The ensemble machine learning methods incorporating bagging, random subspace, random forest, and rotation forest employing decision trees, i.e. Pruned Model Trees, as base learning algorithms were developed in WEKA environment. The methods were applied to the real-world regression problem of predicting the prices of residential premises based on historical data of sales/purchase transactions. T...